- Article
STM-Net: A Multiscale Spectral–Spatial Representation Hybrid CNN–Transformer Model for Hyperspectral Image Classification
- Yicheng Hu, Jia Ge and Shufang Tian
Hyperspectral images (HSIs) are widely used in remote sensing, environmental monitoring, agriculture, and other fields because of their rich spectral information and complex spatial properties. However, the inherent redundancy, spectral aliasing, and spatial heterogeneity of high-dimensional data pose significant challenges to classification accuracy. This study therefore proposes STM-Net, a hybrid deep learning model that integrates a Spectral–Spatial Residual Extraction (SSRE) module, a Transformer branch, and a Multi-scale Differential Residual Module (MDRM) to comprehensively exploit spectral–spatial features and enhance classification performance. First, the SSRE module employs 3D convolutional layers combined with residual connections to extract multi-scale spectral–spatial features, thereby improving the representation of both local and deep-level characteristics. Second, the MDRM incorporates multi-scale differential convolution and the Convolutional Block Attention Module (CBAM) to refine local feature extraction and enhance inter-class discriminability at category boundaries. Third, the Transformer branch, equipped with a Dual-Branch Global-Local (DBGL) mechanism, integrates local convolutional attention and global self-attention, enabling joint optimization of long-range dependency modeling and local feature enhancement. STM-Net is evaluated on three benchmark HSI datasets: Indian Pines, Pavia University, and Salinas. The experimental results show that the proposed model consistently outperforms existing methods in overall accuracy (OA), average accuracy (AA), and the Kappa coefficient, exhibiting superior generalization capability and stability. Ablation studies further confirm that the SSRE, MDRM, and Transformer components each contribute significantly to classification performance.
This study presents an effective spectral–spatial feature fusion framework for hyperspectral image classification, offering a novel technical solution for remote sensing data analysis.
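The three reported metrics can be illustrated with a minimal sketch based on their standard definitions. The function name `classification_metrics` and the toy confusion matrix are assumptions made for illustration; the abstract does not describe the authors' evaluation code or data splits.

```python
def classification_metrics(confusion):
    """Return (OA, AA, Kappa) from a square confusion matrix.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j (assumed layout, not from the paper).
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    if n == 0 or total == 0:
        raise ValueError("confusion matrix must contain at least one sample")

    # Overall accuracy: fraction of all samples classified correctly.
    correct = sum(confusion[i][i] for i in range(n))
    oa = correct / total

    # Average accuracy: mean of per-class recall, skipping empty classes.
    row_totals = [sum(row) for row in confusion]
    recalls = [confusion[i][i] / row_totals[i] for i in range(n) if row_totals[i] > 0]
    aa = sum(recalls) / len(recalls)

    # Cohen's kappa: agreement corrected for chance agreement p_e.
    # Undefined when p_e == 1 (all mass in one class); not handled here.
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    kappa = (oa - p_e) / (1 - p_e)
    return oa, aa, kappa


# Toy two-class example with a strong class imbalance (hypothetical numbers).
toy = [[90, 10],
       [5, 5]]
oa, aa, kappa = classification_metrics(toy)
print(f"{oa:.4f} {aa:.4f} {kappa:.4f}")  # prints "0.8636 0.7000 0.3265"
```

In this toy case OA is high because the majority class dominates, while AA and Kappa are much lower. This is one reason hyperspectral classification results are usually reported with all three metrics.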
Remote Sens., 14 December 2025


